Description

Virtual assistant, which outputs audible information to 
a user of a data terminal by means of at least two 
electroacoustic converters, and method for presenting
audible information of a virtual assistant 

The invention relates to a virtual assistant, which 
outputs audible information to a user of a data 
terminal by means of at least two electroacoustic
converters, and a method for presenting audible
information of a virtual assistant for a user of a data
terminal.

When using PC application programs, it is generally
known that the user can make use of a virtual
assistant, that is to say a computer-based help
(program) that supports the user when carrying out the
steps necessary to perform a task, or when the user
wishes further explanations about the capabilities of
the PC application program. In addition, the user's
attention is drawn to any incorrect inputs and the
virtual assistant makes input suggestions to the user.
The information provided by the virtual assistant is
presented to the user optically, that is to say by
means of a display unit.

In principle, this function of a virtual assistant,
which is helpful to the user, can also be applied to
mobile data terminals such as mobile phones or
terminals known as Personal Digital Assistants
(PDAs). In this case, however, it is disadvantageous
for the user that the extensive information presented
by the virtual assistant must be displayed on the small
display unit of the mobile data terminal.
Moreover, extensive information of a virtual assistant
that is presented optically is difficult for the user
of a data terminal to process whenever the user needs
to concentrate at the same time on other optically
presented information in the vicinity or on acoustic
information from a conversation partner. In this case it
is expedient to provide the information presented by
the virtual assistant of a data terminal to the data
terminal user by means of an acoustic presentation. In
this way, the data terminal user can better process both
the acoustically presented information and additional
information presented optically at the same time.

On the other hand, data terminals and methods are known
in which additional information is acoustically
presented to the user of the data terminal or of the
method. For instance, an assistant in a ticket machine
guides the user of the ticket machine through the
respective operating programs of the ticket machine by
means of acoustic information.

Since these ticket machines are often sited in a loud 
environment, it is difficult for the user of the ticket 
machine to follow the acoustic information output by 
the assistant of the ticket machine. It is even more
difficult to follow acoustic information that is 
simultaneously acting on a user from two different 
signal sources. 

So-called binaural technology has been the subject of
research for some time now. For example, an
introduction to binaural technology is given under
the title "An introduction to binaural technology" by
J. Blauert (1996) in Binaural and Spatial Hearing in
Real and Virtual Environments, edited by R. Gilkey & T.
Anderson, pages 593-609, Lawrence Erlbaum, Hillsdale,
NJ, USA.
With the aid of binaural technology, that is, by signal
processing of the sound information, the listener can be
led to assign the sound-generating source to any position
in the surrounding space. The position of the listener,
and of the electroacoustic converters outputting
the acoustic information, remains
spatially fixed. By means of suitable signal processing
of the sound information, it is then possible, for
example, to give the listener the subjective
impression that the sound-generating source is turning
around him, or is coming toward him, or is moving away
from him. By signal processing of the sound
information, the sound-generating source can therefore
be spatially positioned anywhere.
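
The following minimal Python sketch illustrates one simple way such signal processing can shift the perceived direction of a mono signal to the left or right. It is not taken from the description, which leaves the processing unspecified. It assumes the Woodworth approximation for the interaural time difference, an average head radius of 8.75 cm, and a crude level difference of at most 6 dB; the function names and all constants are hypothetical. Front/back and distance cues, such as the "from behind" placement used in the embodiments, would in practice require filtering with head-related transfer functions, which this sketch does not attempt.

```python
import math

SPEED_OF_SOUND_M_S = 343.0   # assumed speed of sound in air
HEAD_RADIUS_M = 0.0875       # assumed average head radius


def interaural_time_difference(azimuth_deg):
    """Woodworth approximation of the interaural time difference in seconds.

    azimuth_deg: 0 = straight ahead, +90 = directly to the right.
    Intended for -90..90 degrees; a positive result means the right ear leads.
    """
    theta = math.radians(azimuth_deg)
    return HEAD_RADIUS_M / SPEED_OF_SOUND_M_S * (theta + math.sin(theta))


def position_mono_signal(samples, azimuth_deg, sample_rate=44100):
    """Return (left, right) channels for a mono signal placed at azimuth_deg."""
    itd = interaural_time_difference(azimuth_deg)
    delay = round(abs(itd) * sample_rate)  # delay of the far ear in samples
    # Assumed level difference: far ear attenuated by up to 6 dB at 90 degrees.
    far_gain = 10 ** (-6.0 * abs(math.sin(math.radians(azimuth_deg))) / 20)
    near = list(samples) + [0.0] * delay
    far = [0.0] * delay + [s * far_gain for s in samples]
    # Source on the right: the left ear is the far ear, and vice versa.
    return (far, near) if itd > 0 else (near, far)
```

Calling `position_mono_signal(samples, 60.0)` delays and attenuates the left channel, so that over headphones the signal appears to come from the listener's right.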

It is therefore the object of the present invention to 
develop a technical solution for the user of a data 
terminal, in which the acoustic information output by 
the virtual assistant of the data terminal can be 
better separated, in terms of the user's perception,
from other sound sources that are likewise acting on 
the data terminal user. 

The object is achieved on the basis of the virtual
assistant defined in the preamble of claim 1 by the
features set out in the characterizing part of claim 1,
and on the basis of the method defined in the preamble
of claim 9 by the features set out in the
characterizing part of claim 9. Advantageous
refinements of the invention are set out in the
subclaims.

According to the invention, a virtual assistant which
outputs audible information to a data terminal user by
means of at least two electroacoustic converters can be
spatially positioned by the user in order to achieve a
better spatially acoustic separation between the
information output by means of the electroacoustic
converters and additional information output by at
least one further sound source.

One advantage of the invention is the utilization of
the spatial positioning of sound sources by means of
signal processing of the sound information of the
virtual assistant of the data terminal, and of its
localization by the data terminal user. The data
terminal user can thus perceive the sound information
of the virtual assistant more clearly and separately
from ambient noises.

Furthermore, the sound information of the virtual
assistant can be supplied to the data terminal user in
a targeted manner from one direction, while the user is
simultaneously holding a conversation with a
conversation partner in the room. Here, too, it is
possible to achieve a good spatially acoustic
separation between the sound information acting on the
user from the virtual assistant and from the
conversation partner. This enables the user to receive
and process both the information coming from the
virtual assistant and the information coming from the
conversation partner. At the very least, the
simultaneous receiving and processing of both streams
of information is made easier for the user.

A further advantage emerges when, in addition to the
sound information coming from the virtual assistant and
the ambient noises originating from further sound
sources present in the vicinity of the user,
information that is presented optically also
acts on the data terminal user at the same time. In this
case, too, the data terminal user can better receive
and process the information coming from the various
sources.

Further advantages of the invention emerge from the
description below, in which the invention is explained
with reference to two exemplary embodiments.

In the first exemplary embodiment, a pedestrian is
situated in road traffic. The pedestrian is laden with
heavy shopping bags. The pedestrian would like to
conduct a phone call using his data terminal in the
form of a mobile phone. The mobile phone is switched
on, but is stowed away in one of his shopping bags and
therefore cannot be readily located. The pedestrian is,
however, wearing a lightweight headphones and microphone
set. Integrated in the headphones and microphone set are two
electroacoustic converters for outputting sound
information. Like the mobile phone, the headphones and
microphone set is connected to a radio module, for
example a Bluetooth radio module, for short-range
data exchange between the headphones and microphone set
and the mobile phone.

The pedestrian, as user of the headphones and microphone
set and of the mobile phone, activates the
headphones and microphone set and thus enables data
exchange between the headphones and microphone set and
the mobile phone. The user speaks the word "DIAL" into
the headphones and microphone set, whereupon the
virtual assistant of the mobile phone responds with
"PLEASE SAY THE NAME". The user says the name of the
person he wishes to call. Since the user is moving in
an environment with a high noise level, the mobile
phone does not recognize the name of the person to be
called with sufficient accuracy. The mobile phone
processes the name spoken by the user and compares it
with names stored in the internal phone directory of
the mobile phone. The mobile phone recognizes the name
spoken as either "SCHMITZER" or "SCHNITZLER". Output of the
two names to the display unit of the mobile phone and
the subsequent request to the user to select one of
these names is of no use to the user. This is because,
as already mentioned, the user's mobile phone is hidden
in one of the shopping bags in a place that is difficult
to access. On the other hand, the mobile phone has
recognized that it is being operated by the user via the
headphones and microphone set, so the mobile phone
instructs its virtual assistant to output all
similar-sounding names to the user using the headphones
and microphone set. For example, the user hears the
following words of his virtual assistant via the
headphones and microphone set: "THE NAME WAS NOT
CLEARLY RECOGNIZED. PLEASE SELECT ONE OF THE
FOLLOWING OPTIONS: SCHMITZER" and, after a brief pause,
"SCHNITZLER".

Despite the loud ambient noises, the user can make out
both of the options offered by the virtual assistant
because binaural technology is used when the sound
information of the virtual assistant of the mobile
phone is output by means of the electroacoustic
converters. The binaural technology enables
targeted signal processing of the sound information in
the mobile phone. When the sound information is played
back by the virtual assistant using the headphones and
microphone set, the mobile phone user perceives the
sound information output by the virtual assistant as
coming from a clearly defined location. In accordance
with a user preset, the sound information is
processed in the mobile phone in such a way that
the mobile phone user localizes the sound information
presented by the virtual assistant as if it were coming
from the vicinity of the head: the sound information is
"whispered" into the user's ear over his shoulder from
behind.

The position of the virtual assistant, that is, the position
from which the sound information output by the virtual
assistant is perceived, can be changed as
desired by the mobile phone user, for example by means
of an electromechanical input device known per se.
The electromechanical input device is for example a
ball in a socket. The rotations of the ball produced by
the user are detected by sensors. Alternatively, the
positioning of the virtual assistant is performed in a
manner known per se by means of voice commands or by
means of inputs on a touch-sensitive display unit of
the mobile phone.

If the mobile phone has a head position sensor which
detects the head movements of the mobile phone user,
for example using a rotational rate sensor or a
magnetic field sensor, it is furthermore possible for
the selected position of the virtual assistant to be
retained even when the user moves his head, by taking
the head movements into account during the signal
processing of the sound information.
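
As a rough illustration of how head movements might be taken into account, the Python sketch below keeps the assistant at a fixed direction in the room by subtracting the current head yaw from the selected azimuth before the sound is rendered. This is only one plausible reading of the passage. The function names, the sign convention (positive angles to the right) and the simple integration of a rotational rate reading are assumptions, not details given in the description.

```python
def update_head_yaw(yaw_deg, rate_deg_per_s, dt_s):
    """Integrate a rotational rate reading over dt_s seconds (assumed sensor model)."""
    return yaw_deg + rate_deg_per_s * dt_s


def relative_azimuth(assistant_azimuth_deg, head_yaw_deg):
    """Direction of the assistant relative to the head, wrapped to (-180, 180]."""
    diff = (assistant_azimuth_deg - head_yaw_deg) % 360.0
    return diff - 360.0 if diff > 180.0 else diff


# Assistant placed behind the right shoulder (135 degrees, hypothetical preset).
yaw = update_head_yaw(0.0, 90.0, 0.5)   # head turns right at 90 deg/s for 0.5 s
print(relative_azimuth(135.0, yaw))     # angle to render after the head turn
```

The rendering stage would then use the relative angle rather than the preset angle, so the assistant appears to stay in place while the head turns.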

By means of the preset positioning of the virtual
assistant, or the ability of the user to
change its position as desired, the user can both
operate the mobile phone in a simple manner using voice
commands to establish an outgoing connection and
attentively perceive ambient noises, such as loud calls
or the sounding of horns.

To complete the selection between the names "SCHMITZER"
and "SCHNITZLER" presented by the virtual assistant in
order to establish an outgoing connection, the user
responds to the name "SCHMITZER" by speaking "NO"
into the headphones and microphone set and
to the name "SCHNITZLER" by speaking "YES". The mobile
phone recognizes the selection of "SCHNITZLER" and
establishes an outgoing call.
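
The dialog of the first embodiment can be sketched in Python as follows. The description does not say how the phone compares the spoken name with its directory, so the sketch substitutes a plain spelling comparison from the standard `difflib` module for acoustic matching. The directory contents, the misrecognized input "SCHNITZER", the cutoff value and both function names are hypothetical, and the spoken answers are simulated with a dictionary.

```python
import difflib


def similar_names(recognized, directory, cutoff=0.7, limit=3):
    """Return directory entries whose spelling is close to the recognized name."""
    upper = [name.upper() for name in directory]
    return difflib.get_close_matches(recognized.upper(), upper,
                                     n=limit, cutoff=cutoff)


def select_by_voice(candidates, ask):
    """Offer each candidate in turn; return the first one answered with YES."""
    for name in candidates:
        if ask(name).strip().upper() == "YES":
            return name
    return None  # no candidate was confirmed


# Hypothetical directory and a noisy recognition result.
directory = ["MEYER", "SCHMITZER", "SCHNITZLER"]
candidates = similar_names("SCHNITZER", directory)

# Simulated user answers standing in for spoken replies.
answers = {"SCHMITZER": "NO", "SCHNITZLER": "YES"}
chosen = select_by_voice(candidates, lambda name: answers.get(name, "NO"))
print(chosen)
```

In this sketch the candidates are offered in order of spelling similarity, which need not match the order in which the virtual assistant reads them out in the embodiment.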

In the second exemplary embodiment, a teleconferencing
situation is described. Taking part in the
teleconference are a plurality of people who for the
most part speak and understand different languages. The
participants are seated at individual tables spread
throughout the teleconferencing room, with each person
having their own display. If one participant starts to
speak, then the data terminal in the form of a
teleconferencing system displays this participant on a
large screen on a side wall of the teleconferencing room,
so that the other participants can also observe the facial
expressions and gestures of this participant.

In addition, his speech is output via electroacoustic
converters in the form of loudspeakers which are
connected to the teleconferencing system.

At the same time, the contributions of the speaking
participant are simultaneously interpreted into the
languages of the other participants, and the
translation is made available to the participants in
the form of sound information via a headphones and
microphone set in which two electroacoustic converters
for outputting sound information are integrated. To
offer the participants the option of attentively
following the speech both in the language of the
participant speaking and in the language of the
simultaneous interpretation, the simultaneous
interpretation is output by the teleconferencing system
using a virtual assistant so that the other
participants can hear it. The virtual assistant can be
positioned anywhere in the room by each teleconference
participant by entering the respective key combinations
into the teleconferencing system.

Here, too, the positioning of the virtual assistant, that is,
the spatially acoustic perception by the individual
participants of the sound information output by the
virtual assistant, is achieved by
means of signal processing of the sound information in
the teleconferencing system. The participants position
the virtual assistant in such a way that they
perceive the sound information output by the virtual
assistant as being transmitted over the shoulder from
behind and coming from the vicinity of the head. By
virtue of this positioning of the virtual assistant, a good
spatially acoustic separation between the speech
transmitted via loudspeakers and the simultaneous
interpretation of the speech is achieved, so that the
participants can readily follow both the speech
transmitted via loudspeakers and the simultaneous
interpretation, and can still attentively observe the
facial expressions and gestures of the participant
speaking. That is to say, the participants can
attentively follow a plurality of information streams
at the same time.

If one participant already knows what a member of his own
delegation is going to say, then said participant can
have the teleconferencing system give him further
information acoustically via the virtual assistant, for
example about the schedule for the day, background
information about the other participants, or
information about the participant's hotel.

The examples given are not exhaustive. The concept of
the spatially acoustic separation between sound information
which is output to a data terminal user via a virtual
assistant and additional simultaneously audible and/or
visible information which is important to the user can
be applied to further examples, in particular in cases
where mobile communication terminals are employed by a
user. Travel guides are cited here by way of example:
the travel guide explains certain exhibits of a
museum to visitors in the language of the country; the
visitors are able to listen via their UMTS mobile phone
to a simultaneous interpretation of the explanations of
the travel guide with good spatially acoustic
separation via a virtual assistant, and optionally can
attentively follow additional optical information
relating to the exhibits on the display unit of their
UMTS mobile phone at the same time.



